add support for 64 block size on 32 warp size supported amd gpus #1748

electron271 · 2025-09-06T21:07:19Z

Per https://rocm.docs.amd.com/en/latest/reference/gpu-arch-specs.html, most non-Instinct GPUs support a warp size of 32.

Tested on RX 9070 XT. Looking into getting this tested on AMD Instinct accelerators to ensure GPUs with a warp size of 64 still work.
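
For illustration, the per-device warp size discussed here could be read from `rocminfo`-style output. This is a minimal sketch and not the PR's code: the function name `parse_wavefront_size`, the sample text, and the assumed `Wavefront Size: 32(0x20)` line format are all assumptions, and the fallback to 64 is a conservative choice made only for this example.

```python
import re

# Assumed fallback when nothing can be parsed: treat the device as warp size 64.
DEFAULT_WARP_SIZE = 64


def parse_wavefront_size(rocminfo_text: str) -> int:
    """Return the first wavefront size found in rocminfo-style text.

    Hypothetical helper; assumes lines such as 'Wavefront Size: 32(0x20)'.
    """
    match = re.search(r"Wavefront Size:\s*(\d+)", rocminfo_text)
    return int(match.group(1)) if match else DEFAULT_WARP_SIZE


# Made-up sample text in the assumed format, not real tool output.
sample = "  Name: gfx1201\n  Wavefront Size:           32(0x20)\n"
warp_size = parse_wavefront_size(sample)

# The PR only enables block size 64 on architectures with warp size 32.
supports_blocksize_64 = warp_size == 32
```

Parsing a string rather than calling the tool keeps the sketch testable on machines without ROCm installed.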

matthewdouglas · 2025-09-08T18:36:35Z

Thanks for the PR! I don't have the bandwidth to test this personally at the moment, so will defer to AMD team. Also I do not have any RDNA GPUs on hand.

cc: @pnunna93

github-actions · 2025-09-09T16:17:03Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

pnunna93

Thanks for the PR! It's good to go once the warp size change is made.

csrc/ops.hip

matthewdouglas · 2025-10-03T13:21:36Z

Hi @electron271
There are still a couple of conflicts, mostly because we removed all of the imports related to IPEX. If you don't mind fixing those, I think we can merge after that! Thanks!

electron271 · 2025-10-20T18:51:15Z

Will look through all this soon. Sorry, I have been somewhat busy.

Reuse BNB_WARP_SIZE macro

csrc/kernels.hip

csrc/ops.hip

matthewdouglas · 2025-10-27T16:35:09Z

Hi,
It looks like this breaks build compatibility for ROCm 6.1. I would be OK with dropping ROCm 6.1 compatibility if @pnunna93 agrees, but otherwise we would need to fix that build as well.

Apart from that, just a few linting issues to fix.

pnunna93 · 2025-10-28T16:16:46Z

I agree, we can deprecate 6.1 compatibility

matthewdouglas · 2025-10-28T20:37:18Z

I've opened #1788 which removes the ROCm 6.1 build.

sstamenk · 2025-11-05T19:01:01Z

Did some regression testing compared to the main branch on W7900 (gfx1100), R9700 (gfx1201) and MI300x (gfx942) using the rocm/vllm:latest Docker image. There don't seem to be any regressions. Out of the 804 newly enabled tests on gfx1100 and gfx1201, 156 fail due to accuracy issues while the other 648 pass. Attaching some logs:

W7900 (gfx1100)
- PR bitsandbytes_tests_gfx1100.log
R9700 (gfx1201)
- PR bitsandbytes_tests_gfx1201.log
- main bitsandbytes_tests_gfx1201_main.log
MI300x (gfx942)
- PR bitsandbytes_tests_gfx942.log
- main bitsandbytes_tests_gfx942_main.log
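
As a quick check of the figures quoted above, the pass and fail shares of the newly enabled tests can be worked out directly. This sketch uses only the three counts from the comment; the variable names are illustrative.

```python
# Figures quoted in the comment above for gfx1100 and gfx1201.
newly_enabled = 804
failed = 156
passed = 648

# The two counts account for every newly enabled test.
assert passed + failed == newly_enabled

pass_rate = 100 * passed / newly_enabled
fail_rate = 100 * failed / newly_enabled
print(f"pass: {pass_rate:.1f}%  fail: {fail_rate:.1f}%")
```

Roughly four in five of the newly enabled tests pass, and the remaining fifth are the accuracy failures.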

matthewdouglas · 2025-11-05T19:27:26Z

Thanks @sstamenk - that's quite useful! The failing tests seem to be mostly gemv with fp32. I think that's OK for now and can be addressed separately.

@electron271 If we fix the lint issues and merge conflict I'm happy to merge this in!

sstamenk · 2025-11-05T19:33:52Z

tests/test_functional.py

 import bitsandbytes as bnb
 from bitsandbytes import functional as F
-from bitsandbytes.cextension import HIP_ENVIRONMENT
+from bitsandbytes.cextension import HIP_ENVIRONMENT, ROCM_GPU_ARCH, ROCM_WARP_SIZE_64


ROCM_GPU_ARCH is unused here and can be removed.

sstamenk · 2025-11-05T19:37:47Z

bitsandbytes/cextension.py



 ROCM_GPU_ARCH = get_rocm_gpu_arch()
+ROCM_WARP_SIZE_64 = True if get_rocm_warpsize() == 64 else False


Should we rename ROCM_WARP_SIZE_64 and get_rocm_warpsize() to something generic like WARP_SIZE_64 and get_warpsize(), since this technically covers both the HIP and CUDA cases? That would also make more sense for the unit test skip conditions. @matthewdouglas
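
A rough sketch of what the generic naming suggested above might look like. The names `get_warpsize` and `WARP_SIZE_64` come from the suggestion, but the signature, the `supported_blocksizes` helper, and the list of block sizes are assumptions made for illustration and are not the library's API.

```python
def get_warpsize(hip_environment: bool, rocm_warp_size: int = 64) -> int:
    """Hypothetical backend-agnostic helper (name taken from the suggestion).

    CUDA devices use 32-wide warps; on HIP the detected value is passed in.
    """
    return rocm_warp_size if hip_environment else 32


def supported_blocksizes(warp_size: int) -> list[int]:
    """Hypothetical test helper: offer block size 64 only when warp size is 32."""
    sizes = [4096, 2048, 1024, 512, 256, 128]  # assumed list, for illustration
    return (sizes + [64]) if warp_size == 32 else sizes


# The comparison already yields a bool, so `True if ... else False` is not needed.
WARP_SIZE_64 = get_warpsize(hip_environment=True, rocm_warp_size=64) == 64
```

With a single flag like this, a test parametrization or skip condition could key off the warp size alone rather than checking the backend and the architecture separately.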

electron271 added 2 commits September 6, 2025 00:28

add support for 64 block size on 32 warp size supported amd gpus

d607127

uncomment 64 block size support in csrc

f7b4430

electron271 mentioned this pull request Sep 6, 2025

ROCM support unslothai/unsloth#3279

Open

only enable 64 block size support on architectures with 32 warp size

6e2e4d2

matthewdouglas added the ROCm label Sep 8, 2025

pnunna93 suggested changes Sep 24, 2025

csrc/ops.hip (outdated)

use BNB_WARP_SIZE instead of warpSize in ops.hip

8c24b4d

matthewdouglas previously approved these changes Oct 3, 2025

matthewdouglas added this to the v0.49.0 milestone Oct 3, 2025

Reuse BNB_WARP_SIZE macro

7dd7b88

sstamenk mentioned this pull request Oct 20, 2025

Reuse BNB_WARP_SIZE macro electron271/bitsandbytes#1

Merged

Merge pull request #1 from sstamenk/reuse_bnb_warp_size_macro

978eccf

Reuse BNB_WARP_SIZE macro

electron271 dismissed matthewdouglas’s stale review via 978eccf October 20, 2025 21:30

Merge branch 'main' into main

1c94701

electron271 requested a review from matthewdouglas October 20, 2025 21:37

sstamenk reviewed Oct 20, 2025

csrc/kernels.hip (outdated)

sstamenk reviewed Oct 20, 2025

csrc/ops.hip (outdated)

sstamenk mentioned this pull request Oct 21, 2025

Enable bitsandbytes quantization on AMD GPUs that use warp size 32 vllm-project/vllm#27307

Draft

5 tasks

Remove unused WARP_SIZE definitions

e9f0af3

electron271 requested a review from pnunna93 October 24, 2025 01:32

Merge branch 'main' into main

ca63206

sstamenk reviewed Nov 5, 2025
